ABSTRACT

This study in parametric test theory deals with the
statistics of reliability estimation when scores on two parts of a
test follow a binormal distribution with equal (case 1) or unequal
(case 2) expectations. In each case biased maximum-likelihood
estimators of reliability are obtained and converted into unbiased
estimators. Sampling distributions are derived. Second moments are
obtained and utilized in calculating mean square errors of estimation
as a measure of accuracy. A rank order of four estimators is
established. There is a uniformly best estimator. Tables of absolute
and relative accuracies are provided for various reliability
parameters and sample sizes. (Author)
ON ACCURACY IN RELIABILITY ESTIMATION

1. Introduction and Preliminaries

The present paper constitutes a part of a larger study in parametric
test theory. It is devoted to the development of formulas relevant in
reliability estimation, with emphasis on the accuracy of various estimates
when explicit distributional assumptions regarding test scores are made.
This approach also permits the derivation of statistical tests of hypotheses
about reliability parameters. It does not limit itself to giving
conjectured large sample estimators and stating relationships between
parameters.

Probably the first paper with emphasis on small sample distributions
of estimated reliabilities was written by Kristof [1965]. Feldt [1965]
also took up the subject. Other papers in the area of parametric test
theory were given by Kristof [1970, 1972].

Inferences about the reliability of a given test require repeated 
measurement in one form or another on a sample of subjects. Two approaches
to data collection are common: (a) one obtains multiple measurements 
using basically the same test whose reliability is the quantity of interest; 
(b) one obtains multiple measurements using comparable parts of the test 
whose reliability is the quantity of interest. 

In the second case the reliability of the component parts is stepped 
up to give the reliability of the total test. This procedure is not
required in the first case. One might, therefore, assume that case (b)
should lead to a statistical theory more complicated than that based on
case (a). However, the opposite is true. Not much work with emphasis on
statistics has been presented for case (a). There is an important paper
by Olkin and Pratt [1958] that bears on some of the existing problems. 
Contributions dealing with case (b) are more numerous. 

It is case (b) to which we address ourselves here also. Some of the 
results will be special cases of formulas in Kristof [1965]; others will be
new. In particular, we wish to include a table of mean square errors of
reliability estimators so that the comparative merits of several estimators
can be assessed. It is obvious that the choice of a particular estimator
out of a number of possible ones could well be based upon a comparison of
accuracies. 

In practice the most important instance of case (b) occurs when a
test has been divided into just two parts. Using the classical test theory
model and denoting total observed, true and error scores by $X$, $T$ and
$E$, respectively, and corresponding scores on the parts by $X_i$, $T_i$ and
$E_i$, $i = 1,2$, we write

(1)  $X_i = T_i + E_i$,  $X = X_1 + X_2$,  $T = T_1 + T_2$,  $E = E_1 + E_2$.

Let us perform the transformation

(2)  $Y_1 = X_1 - X_2$,  $Y_2 = X_1 + X_2$.

As to second moments classical test theory tells us that

(3)  $\sigma_{Y_1}^2 = \sigma_E^2$,  $\sigma_{Y_2}^2 = \sigma_T^2 + \sigma_E^2$,  $\sigma_{Y_1 Y_2} = 0$.

It has been assumed that the two parts are indistinguishable as regards
true score variance and mean and variance of errors of measurement. It
has not been assumed that true scores on the parts have equal means. If
they do, we will speak of case 1; if they do not, we have case 2.
In each case the reliability of the total test is given by

(4)  $\rho = \sigma_T^2/(\sigma_T^2 + \sigma_E^2) = 1 - \sigma_{Y_1}^2/\sigma_{Y_2}^2$.

If we supply a hat to a parameter to denote its maximum-likelihood
estimator we obtain

(5)  $\hat\rho = 1 - \hat\sigma_{Y_1}^2/\hat\sigma_{Y_2}^2$.

At this point it is necessary to introduce specific distributional
assumptions. It will be assumed that $X_1$, $X_2$ follow a bivariate normal
distribution.

2. Distribution of $\hat\rho$

Case 1. Since $Y_1$ has expectation zero, we get
$\hat\sigma_{Y_1}^2 = \sum_i y_{1i}^2/N$ where $y_{1i}$ signifies the observed
difference score for subject $i$, $N$ being the sample size. Further,
$\hat\sigma_{Y_2}^2 = \sum_i (y_{2i} - \bar y_2)^2/N$ where $y_{2i}$ is the
observed value of $Y_2$ for subject $i$ and $\bar y_2$ is the arithmetic mean
over subjects. Then $\hat\rho$ is given by (5). Quantity $\hat\sigma_{Y_2}^2$
is biased, $\hat\sigma_{Y_1}^2$ is not. Replacing $N$ by $N - 1$ in
$\hat\sigma_{Y_2}^2$ yields the usual unbiased variance estimator.
Combination of (4) and (5) gives

(6)  $F_{N,N-1} = (N - 1)(1 - \hat\rho)/N(1 - \rho)$.

This magnitude is distributed as $F$ with $df_1 = N$, $df_2 = N - 1$. It
would be a simple matter to obtain the distribution of $\hat\rho$ explicitly from
(6). However, (6) is sufficient for our purposes since it allows us to
test hypotheses about $\rho$ and to establish confidence intervals for $\rho$.
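
The step from (4) and (5) to the $F$ statistic in case 1 can be spelled out as follows. This is a sketch under the binormal assumption stated above; the chi-square symbols are introduced here only for illustration, and the last line assumes that (6) is written with $1 - \hat\rho$ in the numerator.

```latex
\begin{align*}
\frac{N\hat\sigma_{Y_1}^2}{\sigma_{Y_1}^2} &\sim \chi^2_{N}, \qquad
\frac{N\hat\sigma_{Y_2}^2}{\sigma_{Y_2}^2} \sim \chi^2_{N-1}
\quad\text{(independent, since $Y_1$ and $Y_2$ are uncorrelated and jointly normal)},\\[4pt]
\frac{1-\hat\rho}{1-\rho}
  &= \frac{\hat\sigma_{Y_1}^2/\sigma_{Y_1}^2}{\hat\sigma_{Y_2}^2/\sigma_{Y_2}^2}
   = \frac{\chi^2_{N}}{\chi^2_{N-1}},\\[4pt]
F_{N,N-1} &= \frac{\chi^2_{N}/N}{\chi^2_{N-1}/(N-1)}
   = \frac{(N-1)(1-\hat\rho)}{N(1-\rho)}.
\end{align*}
```

The first chi-square has $N$ degrees of freedom because no mean is estimated for $Y_1$ in case 1; the second loses one degree of freedom to $\bar y_2$.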

Case 2. Now we have $\hat\sigma_{Y_1}^2 = \sum_i (y_{1i} - \bar y_1)^2/N$,
$\hat\sigma_{Y_2}^2$ as before. Both $\hat\sigma_{Y_1}^2$ and
$\hat\sigma_{Y_2}^2$ are biased. This is immaterial, however, because the same
bias factor $(N - 1)/N$ applies to both and cancels in the ratio. We find that
the quantity

(7)  $F_{N-1,N-1} = (1 - \hat\rho)/(1 - \rho)$

follows an $F$ distribution with $df_1 = df_2 = N - 1$, $\hat\rho$ defined in (5).
It is possible to replace (7) by an equivalent but possibly more convenient
formula if we use the fact that

(8)  $t_n = \sqrt{n}\,(\sqrt{F_{n,n}} - 1/\sqrt{F_{n,n}})/2$

follows a $t$ distribution with $df = n$ when $F_{n,n}$ follows an $F$
distribution with $df_1 = df_2 = n$. This relationship was discovered
independently by Aroian [1953], Cacoullos [1965] and Kristof [1972]. Thus,
since the reciprocal of (7) is again $F_{N-1,N-1}$,

(9)  $t_{N-1} = (\hat\rho - \rho)\sqrt{N - 1}/2\sqrt{(1 - \hat\rho)(1 - \rho)}$.

Equation (9) has no equally simple counterpart in case 1.
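
As a minimal sketch of the estimators described in this section, the following Python function computes the maximum-likelihood estimate of reliability from the scores on the two parts, using the transformation (2) and formula (5). The function name `reliability_mle` and the flag `equal_means` (True for case 1, False for case 2) are illustrative choices and not notation from the paper.

```python
def reliability_mle(x1, x2, equal_means=True):
    """Maximum-likelihood estimate of total-test reliability from two part scores.

    x1, x2      : sequences of part scores, one pair per subject
    equal_means : True  -> case 1 (difference score has expectation zero)
                  False -> case 2 (mean of the difference score is estimated)
    """
    n = len(x1)
    if n != len(x2) or n < 2:
        raise ValueError("need two score lists of equal length >= 2")

    # Transformation (2): difference and sum scores.
    y1 = [a - b for a, b in zip(x1, x2)]
    y2 = [a + b for a, b in zip(x1, x2)]

    # ML variance of the sum score (divisor N, mean estimated).
    mean_y2 = sum(y2) / n
    var_y2 = sum((v - mean_y2) ** 2 for v in y2) / n
    if var_y2 == 0:
        raise ValueError("total scores have zero variance")

    # ML variance of the difference score.
    if equal_means:
        var_y1 = sum(v * v for v in y1) / n            # case 1: no centering
    else:
        mean_y1 = sum(y1) / n
        var_y1 = sum((v - mean_y1) ** 2 for v in y1) / n  # case 2: centered

    # Formula (5).
    return 1.0 - var_y1 / var_y2


if __name__ == "__main__":
    part1 = [2, 3, 4, 5]
    part2 = [1, 3, 2, 4]
    print(reliability_mle(part1, part2, equal_means=True))
    print(reliability_mle(part1, part2, equal_means=False))
```

In this toy data set the difference scores have a nonzero mean, so the case 1 estimate is lower than the case 2 estimate: the uncentered sum of squares charges the mean difference to error variance.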

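3. Bias of $\hat\rho$

Case 1. Taking expectations on both sides of (6) we get [Kendall &
Stuart, 1958, p. 578]

(10)  $E\,F_{N,N-1} = (N - 1)/(N - 3)$, hence $E\hat\rho = \rho - 3(1 - \rho)/(N - 3)$.

Again $\tilde\rho$, the unbiased estimator obtained from $\hat\rho$, is minimum
variance unbiased.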
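
The bias of the maximum-likelihood estimator and the corresponding unbiased estimators can be sketched from the $F$ relations of Section 2. The derivation below assumes that in case 1 the quantity $(N-1)(1-\hat\rho)/N(1-\rho)$ is $F_{N,N-1}$ and that in case 2 the quantity $(1-\hat\rho)/(1-\rho)$ is $F_{N-1,N-1}$; it uses the standard mean $n/(n-2)$ of an $F_{m,n}$ variable and writes $\tilde\rho$ for the unbiased estimator. It is offered as a worked check, not as a reproduction of the paper's own displays.

```latex
\begin{align*}
\text{Case 1:}\quad
  & E\,F_{N,N-1} = \frac{N-1}{N-3}
    \;\Rightarrow\; E(1-\hat\rho) = \frac{N(1-\rho)}{N-3},
    \qquad E\hat\rho = \rho - \frac{3(1-\rho)}{N-3},\\
  & \tilde\rho = 1 - \frac{(N-3)(1-\hat\rho)}{N},
    \qquad E\tilde\rho = \rho.\\[6pt]
\text{Case 2:}\quad
  & E\,F_{N-1,N-1} = \frac{N-1}{N-3}
    \;\Rightarrow\; E(1-\hat\rho) = \frac{(N-1)(1-\rho)}{N-3},
    \qquad E\hat\rho = \rho - \frac{2(1-\rho)}{N-3},\\
  & \tilde\rho = 1 - \frac{(N-3)(1-\hat\rho)}{N-1},
    \qquad E\tilde\rho = \rho.
\end{align*}
```

Both corrections require $N > 3$. In either case the maximum-likelihood estimator underestimates $\rho$ on average, and the bias shrinks at the rate $1/N$.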

5. Variance of $\hat\rho$

Case 1. Knowledge of the variance of $\hat\rho$, $\sigma_{\hat\rho}^2$, will
enable us to calculate the mean square error of estimation in $\hat\rho$.
Taking variances on both sides of (6) we get [Kendall & Stuart, 1958]

(16)  $\sigma_{\hat\rho}^2 = 2N(2N - 3)(1 - \rho)^2/(N - 3)^2(N - 5)$.

Case 2. From (7) we obtain

(17)  $\sigma_{\hat\rho}^2 = 4(N - 1)(N - 2)(1 - \rho)^2/(N - 3)^2(N - 5)$

by an analogous operation.
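
The variance formulas (16) and (17) can be checked against the standard variance of an $F$ variable. The sketch below uses exact rational arithmetic and assumes, as in Section 2, that $1 - \hat\rho$ equals $(1 - \rho)\,N/(N-1)$ times an $F_{N,N-1}$ variable in case 1 and $(1 - \rho)$ times an $F_{N-1,N-1}$ variable in case 2. The function names are illustrative.

```python
from fractions import Fraction


def f_variance(m, n):
    """Variance of an F variable with (m, n) degrees of freedom; requires n > 4."""
    if n <= 4:
        raise ValueError("variance of F is finite only for n > 4")
    m, n = Fraction(m), Fraction(n)
    return 2 * n**2 * (m + n - 2) / (m * (n - 2) ** 2 * (n - 4))


def variance_factor_case1(N):
    """Factor multiplying (1 - rho)^2 in the variance of rho-hat, case 1."""
    scale = Fraction(N, N - 1)          # 1 - rho_hat = (1 - rho) * N/(N-1) * F
    return scale**2 * f_variance(N, N - 1)


def variance_factor_case2(N):
    """Factor multiplying (1 - rho)^2 in the variance of rho-hat, case 2."""
    return f_variance(N - 1, N - 1)     # 1 - rho_hat = (1 - rho) * F


if __name__ == "__main__":
    for N in (6, 10, 20, 50, 100):
        lhs1 = variance_factor_case1(N)
        rhs1 = Fraction(2 * N * (2 * N - 3), (N - 3) ** 2 * (N - 5))   # (16)
        lhs2 = variance_factor_case2(N)
        rhs2 = Fraction(4 * (N - 1) * (N - 2), (N - 3) ** 2 * (N - 5))  # (17)
        print(N, lhs1 == rhs1, lhs2 == rhs2)
```

Both variances exist only for $N > 5$, which is why the later comparisons are restricted to that range.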

6. Mean Square Errors of Estimation in $\hat\rho$ and $\tilde\rho$

The mean square error of estimation, MSE, is a likely choice if a
measure of accuracy of an estimator is sought. MSEs of $\hat\rho$ are found
when we use

(18)  MSE $= E(\hat\rho - \rho)^2 = \sigma_{\hat\rho}^2 + (E\hat\rho - \rho)^2$

with $\hat\rho$ replaced by the unbiased estimator $\tilde\rho$ if the latter
quantity is of interest. In this case MSE $= \sigma_{\tilde\rho}^2$.

Magnitudes $\sigma_{\hat\rho}^2$ in cases 1 and 2 are given in (16) and (17).
$(E\hat\rho - \rho)^2$ is obtained from (10) and (11). Quantities
$\sigma_{\tilde\rho}^2$ may be determined from (12) and (14).
After calculating the four mean square errors of estimation we wish
to summarize the results. In each case the MSE is the product of $(1 - \rho)^2$
and a factor which depends only on $N$. These factors are listed in the
following fourfold table:

                       Case 1                        Case 2

MSE of $\hat\rho$      $(4N + 15)/(N - 3)(N - 5)$    $4(N + 1)/(N - 3)(N - 5)$

MSE of $\tilde\rho$    $2(2N - 3)/N(N - 5)$          $4(N - 2)/(N - 1)(N - 5)$

For conciseness let us use the symbol MSE$_{ij}$ to indicate the mean
square error of estimation in case $i$, $i = 1,2$, for estimator of type
$j$ where $j = 1$ refers to $\hat\rho$ and $j = 2$ to $\tilde\rho$. We see that

(19)  MSE$_{12}$ < MSE$_{22}$ < MSE$_{21}$ < MSE$_{11}$

for all $\rho < 1$ and $N > 5$. Hence, if accuracy of estimation is the
criterion, $\tilde\rho$ in case 1 is best and $\hat\rho$ in case 1 is worst. In
addition, $\tilde\rho$ has the advantage of being unbiased. If the division of
the total test into two parts is such that only case 2 applies, then again
$\tilde\rho$ is preferred to $\hat\rho$. At any rate, $\hat\rho$ in case 1 is the
poorest possible choice because MSE$_{11}$ is largest.
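
The rank order (19) can be checked numerically from the four factors in the fourfold table. The following sketch evaluates the factors in exact arithmetic; the dictionary keys `(i, j)` mirror the subscripts of MSE$_{ij}$, and the function names are illustrative.

```python
from fractions import Fraction


def mse_factors(N):
    """Factors multiplying (1 - rho)^2 in the four MSEs, keyed by (case, estimator).

    Estimator 1 is the maximum-likelihood estimator, estimator 2 the unbiased one.
    """
    if N <= 5:
        raise ValueError("the MSEs are finite only for N > 5")
    n = Fraction(N)
    return {
        (1, 1): (4 * n + 15) / ((n - 3) * (n - 5)),
        (2, 1): 4 * (n + 1) / ((n - 3) * (n - 5)),
        (1, 2): 2 * (2 * n - 3) / (n * (n - 5)),
        (2, 2): 4 * (n - 2) / ((n - 1) * (n - 5)),
    }


def ordering_holds(N):
    """True if MSE_12 < MSE_22 < MSE_21 < MSE_11 for sample size N."""
    f = mse_factors(N)
    return f[(1, 2)] < f[(2, 2)] < f[(2, 1)] < f[(1, 1)]


if __name__ == "__main__":
    for N in (6, 10, 20, 50, 100):
        f = mse_factors(N)
        print(N, {k: round(float(v), 4) for k, v in f.items()}, ordering_holds(N))
```

Because $(1 - \rho)^2$ is a common positive factor for $\rho < 1$, comparing the factors alone is enough to establish the ordering for any given $N$.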

In the following table we list the ratios MSE$_{ij}$/MSE$_{12}$, $ij \ne 12$,
for selected sample sizes rounded to four decimal places:

                        N = 6       10       20       50      100

MSE$_{22}$/MSE$_{12}$   1.0667   1.0458   1.0242   1.0099   1.0050

MSE$_{21}$/MSE$_{12}$   3.1111   1.8486   1.3355   1.1187   1.0571

MSE$_{11}$/MSE$_{12}$   4.3333   2.3109   1.5103   1.1790   1.0859



For small sample sizes $\hat\rho$ is generally quite inferior. If sample size
exceeds 100, the differences tend to be negligible, however. This table
can be easily extended if we use the relations

      MSE$_{22}$/MSE$_{12}$ $= 1 + (N - 3)/(2N^2 - 5N + 3)$

(20)  MSE$_{21}$/MSE$_{12}$ $= 1 + (11N - 9)/(2N^2 - 9N + 9)$

      MSE$_{11}$/MSE$_{12}$ $= 1 + (33N - 18)/(4N^2 - 18N + 18)$
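
Extending the table of ratios to other sample sizes is a direct application of relations (20). A minimal sketch follows, with an illustrative function name; it returns exact fractions so that rounding is left to the caller.

```python
from fractions import Fraction


def mse_ratios(N):
    """Ratios MSE_22/MSE_12, MSE_21/MSE_12 and MSE_11/MSE_12 from relations (20)."""
    if N <= 5:
        raise ValueError("the MSEs are finite only for N > 5")
    n = Fraction(N)
    r22 = 1 + (n - 3) / (2 * n**2 - 5 * n + 3)
    r21 = 1 + (11 * n - 9) / (2 * n**2 - 9 * n + 9)
    r11 = 1 + (33 * n - 18) / (4 * n**2 - 18 * n + 18)
    return r22, r21, r11


if __name__ == "__main__":
    # Sample sizes of the printed table plus two further ones.
    for N in (6, 10, 20, 50, 100, 200, 500):
        print(N, "  ".join(f"{float(r):.4f}" for r in mse_ratios(N)))
```

All three ratios tend to 1 as $N$ grows, which is the sense in which the differences between the estimators become negligible in large samples.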

For the best estimator, $\tilde\rho$ in case 1, we give the actual values of
MSE$_{12}$ for various $\rho$ and $N$ rounded to four decimal places:

                N = 6      10      20      50     100

$\rho$ = .60    .4800   .1088   .0395   .0138   .0066
         .70    .2700   .0612   .0222   .0078   .0037
         .80    .1200   .0272   .0099   .0034   .0017
         .90    .0300   .0068   .0025   .0009   .0004
         .95    .0075   .0017   .0006   .0002   .0001



This table reflects the rapid gain of accuracy of estimation as $N$ and/or
$\rho$ increase. We see in particular that the accuracy of estimation will be
uniformly high if $N > 50$ and $\rho > .60$. It will be interesting to
observe the trade-off occurring between $\rho$ and $N$. For instance, the
accuracies in estimating $\rho = 0.60$ when $N = 100$ and $\rho = 0.90$ when
$N = 10$ are about equal.
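
The trade-off between reliability and sample size can be explored for other values with a short computation. The sketch below evaluates the MSE of the unbiased estimator in case 1 as $(1 - \rho)^2$ times the factor $2(2N - 3)/N(N - 5)$ from the fourfold table; the function name is illustrative.

```python
def mse_unbiased_case1(rho, N):
    """MSE of the unbiased estimator in case 1: 2(2N - 3)(1 - rho)^2 / (N(N - 5))."""
    if N <= 5:
        raise ValueError("the MSE is finite only for N > 5")
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    return 2 * (2 * N - 3) * (1 - rho) ** 2 / (N * (N - 5))


if __name__ == "__main__":
    sizes = (6, 10, 20, 50, 100)
    for rho in (0.60, 0.70, 0.80, 0.90, 0.95):
        row = "  ".join(f"{mse_unbiased_case1(rho, N):.4f}" for N in sizes)
        print(f"rho = {rho:.2f}  {row}")

    # Trade-off mentioned in the text: similar accuracy at quite different N.
    print(mse_unbiased_case1(0.60, 100), mse_unbiased_case1(0.90, 10))
```

Since the MSE depends on $\rho$ only through $(1 - \rho)^2$, halving $1 - \rho$ buys the same accuracy as roughly quadrupling $N$ once $N$ is moderately large.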